Abstract

An identifying code in a graph is a set of vertices which intersects all the symmetric differences
between pairs of neighbourhoods of vertices. Not all graphs have identifying codes; those that do are
referred to as twin-free. In this paper, we design an algorithm that finds an identifying code in a twin-free
graph on $n$ vertices in $O(n^3)$ binary operations, and returns a failure if the graph is not twin-free. We
also determine an alternative for sparse graphs with a running time of $O(n^2 d \log n)$ binary operations,
where $d$ is the maximum degree. We also prove that these algorithms can return any identifying code
with minimum cardinality, provided the vertices are correctly sorted.

1 Introduction 
Identifying codes were introduced in [1] for fault diagnosis in multiprocessor systems, and have since then
found applications in location and detection problems. In general, an identifying code in a graph $G$ can
be defined as follows. First, we denote the (closed) neighborhood of any vertex $v$ as
$N(v) = \{v\} \cup \{w : vw \in E(G)\}$. An identifying code is a subset of vertices $C$ which satisfies the
following property: for any two distinct vertices $v$ and $w$, we have $N(v) \cap C \neq N(w) \cap C \neq \emptyset$.
Equivalently, it is any subset of vertices $C$ such that for all distinct $v_1, v_2 \in V(G)$,
$(N(v_1) \Delta N(v_2)) \cap C \neq \emptyset$, where $\Delta$ is the symmetric difference between two sets. A
graph admits an identifying code if and only if it is twin-free [2], where twins are two vertices with the same
neighborhood. We remark that the definitions above are commonly used for a so-called 1-identifying code,
where an $r$-identifying code is defined in terms of balls of radius $r$ around a vertex. Since any $r$-identifying
code can be seen as a 1-identifying code for a related graph, we do not lose any generality in considering
1-identifying codes only. For a thorough survey of identifying codes, the reader is referred to [5], and an
exhaustive bibliography on identifying codes and related topics is maintained in [1].
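
The definition can be checked directly on small graphs. The following Python sketch is only an illustration: the function names, the 0-based vertex labels and the edge-list input are assumptions made here, not notation from the paper. It computes the traces $N(v) \cap C$ and verifies that they are nonempty and pairwise distinct.

```python
def closed_neighborhoods(n, edges):
    """Closed neighborhoods N(v) = {v} U {w : vw in E(G)} for vertices 0..n-1."""
    nbhd = [{v} for v in range(n)]
    for v, w in edges:
        nbhd[v].add(w)
        nbhd[w].add(v)
    return nbhd


def is_twin_free(n, edges):
    """A graph is twin-free when no two vertices share the same closed neighborhood."""
    nbhd = closed_neighborhoods(n, edges)
    return len({frozenset(nb) for nb in nbhd}) == n


def is_identifying_code(n, edges, code):
    """Check that every trace N(v) & C is nonempty and that all traces are distinct."""
    code = set(code)
    traces = [frozenset(nb & code) for nb in closed_neighborhoods(n, edges)]
    return all(traces) and len(set(traces)) == n


# Path on four vertices 0 - 1 - 2 - 3 (hypothetical example graph).
path = [(0, 1), (1, 2), (2, 3)]
print(is_twin_free(4, path))                    # True
print(is_identifying_code(4, path, {0, 1, 2}))  # True
print(is_identifying_code(4, path, {1, 2}))     # False: vertices 1 and 2 share the trace {1, 2}
```

In a triangle all three closed neighborhoods coincide, so `is_twin_free` returns `False` and no subset of vertices passes `is_identifying_code`.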

Since any superset of an identifying code is itself an identifying code, it is natural to search for the 
minimum cardinality $i(G)$ of an identifying code of a given graph $G$. Let us refer to an identifying code as
minimal if it has no proper subset which itself is an identifying code and as minimum if it has the smallest
cardinality amongst all codes. The problem of finding the minimum cardinality of an identifying code was
shown to be NP-hard in [3]. Viewing this problem as an instance of the subset cover problem [5], a greedy
heuristic was also designed and analyzed in [3]. Its running time is on the order of $O(n^4)$ binary operations
and it has the following performance guarantees. It always finds an identifying code whose cardinality is less
than $c_1 i(G) \ln n$ for some nonnegative constant $c_1$; however, there are graphs for which the algorithm always
returns a code with cardinality greater than $c_2 i(G) \ln n$ for another nonnegative constant $c_2$.

Lexicographic codes were introduced in [5] and independently rediscovered in [7] to design large
constant-weight codes, which are sets of binary vectors of equal Hamming weight with a prescribed minimum Hamming
distance (see [8] for a detailed review of constant-weight codes and lexicographic codes). The principle is
to first sort all the vectors with the same Hamming weight, and then construct the code as we run through
them. Adding a codeword is done according to a simple criterion: it must be at distance at least $d$ from
the code constructed so far. The performance of the algorithm depends on the order in which the vectors
have been sorted; moreover, some modifications can be added, such as starting with a predetermined set of
vectors. Many record-holding constant-weight codes have been designed using lexicographic codes. However,
this idea is not limited to constant-weight codes, and its application to nonrestricted binary codes has
led to many interesting results [5]. Lexicographic codes have also been recently applied to the construction of codes on
subspaces in [10], also yielding record-holding codes.
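
The construction principle can be sketched in a few lines of Python. This is an illustrative sketch only: the function name is hypothetical, vectors are represented by their supports, and the supports are visited in the lexicographic order produced by `itertools.combinations`, which is just one possible sorting.

```python
from itertools import combinations


def lexicode_constant_weight(length, weight, min_dist):
    """Greedy lexicographic construction of a constant-weight binary code.

    Each vector is represented by its support (a frozenset of coordinates).
    A vector is kept when its Hamming distance to every codeword chosen so far
    is at least min_dist; the Hamming distance between two vectors equals the
    size of the symmetric difference of their supports.
    """
    code = []
    for support in combinations(range(length), weight):
        word = frozenset(support)
        if all(len(word ^ c) >= min_dist for c in code):
            code.append(word)
    return code


# Length 4, weight 2, minimum distance 4: only two disjoint supports fit.
print(lexicode_constant_weight(4, 2, 4))
```

Changing the order in which the supports are enumerated, or seeding `code` with a predetermined set of vectors, can change the output, which is the dependence on the sorting mentioned above.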

In this paper, we investigate adapting the idea of lexicographic codes to identifying codes. The main 
contribution is an algorithm running in $O(n^3)$ binary operations which returns an identifying code for a
twin-free graph, and returns a failure if the graph is not twin-free. This algorithm is then adapted to sparse
graphs to run in $O(n^2 d \log n)$ binary operations. Both algorithms have the same guarantees in terms of
cardinality of the output. Although we are unable to give an upper bound which does not depend on the
ordering of the vertices, we show that provided the vertices are properly sorted, the algorithm returns a
minimum identifying code. This is fundamentally different from the greedy approach, which runs in $O(n^4)$.

2 Algorithm for general graphs 

2.1 Description and correctness 

Let $G$ be a graph on $n$ vertices with adjacency matrix $A$, and let $B = I_n + A$. We denote the vertices as
$v_1, v_2, \ldots, v_n$, thus $b_{i,j} = 1$ if and only if $v_i \in N(v_j)$; yet we shall abuse notation and identify a vertex with
its index. For instance, we refer to the vertex with minimum index in the neighborhood of $v_i$ as min1($i$).
Also, the output of our algorithm is actually the set of indices of the vertices in the code.

Before giving the pseudocode of Algorithm 1, we describe it schematically below. Its input is the matrix
$B$ of the graph. It then runs along all vertices $v_j$, adding a new codeword to the code $C$ if $N(v_j) \cap C = \emptyset$
or $N(v_j) \cap C = N(v_k) \cap C$ for some $k < j$. While searching for a new codeword to add, the algorithm
may return a failure if the graph is not twin-free, which we identify as $n + 1 \in C$. After the $j$-th step, the
code $C$ then 'identifies' the first $j$ vertices, i.e. they are all covered in a distinct fashion. We keep track of
the intersections $N(v_i) \cap C$ in a matrix $X$. After going through all vertices, the algorithm then returns an
identifying code $C$ or a failure (if $n + 1 \in C$) if the graph is not twin-free.

The subroutine min2($j$, $k$) returns the first vertex which identifies $v_j$ if it exists and a failure otherwise,
i.e. it determines the first vertex in lexicographic order in $N(v_j) \Delta N(v_k)$. If $N(v_j) = N(v_k)$, then it returns
$n + 1$. It is given in Algorithm 2.

We now justify this claim in Lemma 1 below.

Lemma 1 The subroutine min2($j$, $k$) returns the minimum element in $N(v_j) \Delta N(v_k)$ if this symmetric
difference is non-empty, and a failure ($l = n + 1$) otherwise.

Proof First, if $N(v_j) = N(v_k)$, then $B(j, l) = B(k, l)$ for all $1 \le l \le n$. Therefore, the while loop
will only stop once $l = n + 1$, and hence the subroutine returns a failure. Second, if $N(v_j) \neq N(v_k)$, then
the minimum element in $N(v_j) \Delta N(v_k)$ is the smallest $l$ such that $B(j, l) \neq B(k, l)$. It is clear that the
subroutine returns this value. □

Proposition 1 Algorithm 1 returns an identifying code if the input graph is twin-free, and a failure
($n + 1 \in C$) otherwise.

Proof First of all, we prove that the algorithm returns a failure if and only if the graph is not twin-free.
In the latter case, let $k$ be the smallest integer such that the set $\{i \neq k : N(v_k) = N(v_i)\}$ is not empty,
and let $j$ be the minimum element of this set (hence $k < j$, $N(v_k) = N(v_j)$). It is easily shown that after
the $k$-th step, $v_k$ is covered. On the $j$-th step, Algorithm 1 first checks if $v_j$ is covered. Since $v_k$ is covered
and $N(v_k) = N(v_j)$, then $v_j$ is also covered. Algorithm 1 then finds that $k$ is the smallest integer satisfying
$X(k) = X(j)$, and hence calls the subroutine min2($j$, $k$). By Lemma 1 this returns a failure, and hence the
whole algorithm returns a failure. Conversely, the only case where the subroutine (and hence the algorithm)
returns a failure is when there exist $k < j$ such that $N(v_j) = N(v_k)$, i.e. the graph is not twin-free.



Algorithm 1 Main algorithm for general graphs

    $C \leftarrow \emptyset$, $X \leftarrow 0_n$, $j \leftarrow 1$
    while $j \le n$ and $n + 1 \notin C$ do
        $l \leftarrow 0$
        if $X(j) = 0$ then {$v_j$ is not covered}
            $l \leftarrow$ min1($j$)
        else
            $k \leftarrow 1$
            while $X(j) \neq X(k)$ and $k < j$ do {$v_j$ is covered, so we search if it is identified}
                $k \leftarrow k + 1$
            end while
            if $k < j$ then {$v_j$ is not identified}
                $l \leftarrow$ min2($j$, $k$)
            end if
        end if
        if $1 \le l \le n$ then {A new codeword has been found}
            $C \leftarrow C \cup \{l\}$
            $X^T(l) \leftarrow B^T(l)$
        end if
        $j \leftarrow j + 1$
    end while
    return $C$



Algorithm 2 min2($j$, $k$) subroutine

    $l \leftarrow 1$
    while $l \le n$ and $B(j, l) = B(k, l)$ do
        $l \leftarrow l + 1$
    end while
    return $l$
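
Algorithms 1 and 2 can be transcribed into Python roughly as follows. This is a sketch under stated assumptions rather than the authors' implementation: vertices are indexed from 0, so the value $n$ plays the role of the failure value $n + 1$; failure is signalled by returning `None` instead of inserting an extra element into $C$; and the helper `closed_adjacency`, which builds $B = I_n + A$ from an edge list, is a hypothetical convenience.

```python
def closed_adjacency(n, edges):
    """Build B = I_n + A as a list of 0/1 rows (hypothetical helper)."""
    B = [[int(i == j) for j in range(n)] for i in range(n)]
    for v, w in edges:
        B[v][w] = B[w][v] = 1
    return B


def min2(B, j, k):
    """First index l with B[j][l] != B[k][l]; returns n when the rows are equal."""
    n = len(B)
    l = 0
    while l < n and B[j][l] == B[k][l]:
        l += 1
    return l


def lexicographic_identifying_code(B):
    """Return the codewords in the order they are added, or None if twins exist."""
    n = len(B)
    code = []
    X = [[0] * n for _ in range(n)]  # row i is the indicator vector of N(v_i) & C
    for j in range(n):
        new = None
        if not any(X[j]):
            # v_j is not covered: min1(j) is the first 1 in row j (the diagonal guarantees one)
            new = B[j].index(1)
        else:
            # v_j is covered: look for the first earlier vertex with the same trace
            k = 0
            while X[j] != X[k] and k < j:
                k += 1
            if k < j:
                new = min2(B, j, k)
                if new == n:  # N(v_j) = N(v_k): the graph is not twin-free
                    return None
        if new is not None:
            code.append(new)
            for i in range(n):  # copy column `new` of B into X
                X[i][new] = B[i][new]
    return code


# Path 0 - 1 - 2 - 3 (hypothetical example): codewords are added in the order 0, 2, 1.
print(lexicographic_identifying_code(closed_adjacency(4, [(0, 1), (1, 2), (2, 3)])))
```

On a triangle the second vertex has the same closed neighborhood as the first, so `min2` reaches the end of the row and the function returns `None`.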



We now assume that the graph is twin-free, and hence we have $l \le n$ at any step. We need to show that
the output $C$ of Algorithm 1 is an identifying code. Let us denote the matrix $X$ and the code $C$ obtained
after $j$ steps as $X^j$ and $C^j$, respectively. Note that for all $a$, $X^j(a)$ reflects how the vertex $v_a$ is covered by
$C^j$: $N(v_a) \cap C^j = \mathrm{supp}(X^j(a)) = \{b : X^j(a, b) = 1\}$. The following claim is the cornerstone of the proof.

Claim: After step $j$, all $X^j(i)$'s are nonzero and distinct for $1 \le i \le j$.

The proof goes by induction on $j$, and is trivial for $j = 1$. Suppose it is true for $j - 1$, then

$$\mathrm{supp}(X^{j-1}(a)) = N(v_a) \cap C^{j-1} \subseteq N(v_a) \cap C^j = \mathrm{supp}(X^j(a)). \tag{1}$$

It is hence easy to show that if $X^{j-1}(a) \neq 0$, then $X^j(a) \neq 0$ and if $X^{j-1}(a) \neq X^{j-1}(b)$, then $X^j(a) \neq X^j(b)$
for all $a$ and $b$. It immediately follows that the vectors $X^j(i)$ are all nonzero and distinct for $1 \le i \le j - 1$,
and we only have to consider $X^j(j)$. Three cases occur when the algorithm reaches step $j$.

• Case I: $X^{j-1}(j)$ is nonzero and distinct from every $X^{j-1}(i)$ for $1 \le i \le j - 1$. Then as shown above, $X^j(j)$
is nonzero and distinct from all $X^j(i)$'s.

• Case II: $X^{j-1}(j)$ is nonzero and equal to $X^{j-1}(k)$ for some $k < j$. First, we remark that $k$ is
unique, as $X^{j-1}(k) \neq X^{j-1}(i)$ for all other $i$. The min2($k$, $j$) subroutine then returns an element
$v_l \in N(v_j) \Delta N(v_k)$, and hence $X^j(j, l) \neq X^j(k, l)$.

• Case III: $X^{j-1}(j) = 0$. Then by hypothesis $X^{j-1}(j) \neq X^{j-1}(i)$ for all $1 \le i \le j - 1$, and hence
$X^j(j) \neq X^j(i)$. Also, $X^j(j)$ is the unit vector $e_{\mathrm{min1}(j)}$, which is nonzero.

Therefore, for the code $C^n = C$ obtained after $n$ steps, the sets $N(v_a) \cap C$ are all nonempty and distinct for all
$1 \le a \le n$. It is hence an identifying code. □

2.2 Performance 

We now investigate the performance of Algorithm 1. We are first interested in the cardinality of its output.
Clearly, this depends on the order in which the vertices are sorted. We show below that provided the order 
is suitable, the algorithm can find any minimal identifying code, and hence can return a minimum one. 

Proposition 2 Suppose that the graph is twin-free and that $M = \{v_1, v_2, \ldots, v_m\}$ forms an identifying code.
Then Algorithm 1 returns an identifying code that is a subset of $M$.

Proof We know by Proposition 1 that the algorithm returns an identifying code; we only have to prove
that all codewords are in $M$. At step $j$, three cases need to be distinguished.

• Case I: $v_j$ is covered and identified, then no codeword is added.

• Case II: $v_j$ is covered but not identified, i.e. $(N(v_j) \Delta N(v_k)) \cap C^{j-1} = \emptyset$ for some $k < j$. The
subroutine returns the smallest element $v_l$ in $N(v_j) \Delta N(v_k)$. Since $M$ is an identifying code, the set
$(N(v_j) \Delta N(v_k)) \cap M$ is not empty; as $M$ consists of the first $m$ vertices, the smallest element of
$N(v_j) \Delta N(v_k)$ lies in $M$, hence $v_l \in M$.

• Case III: $v_j$ is not covered. The algorithm then selects the next codeword to be min1($j$), which is
necessarily in $M$ as $N(v_j) \cap M \neq \emptyset$.

Therefore, the algorithm only adds codewords of $M$, and hence returns a subcode of $M$. □

We remark that Algorithm 1 does not necessarily return a minimal code, as seen in Figure 1: Algorithm 1 would
return the code {1, 2, 3, 4, 5, 6} while {2, 3, 4, 5, 6} is a minimal identifying code.

On the other hand, if M is minimal, then it has no proper subset that itself is an identifying code; 
Algorithm 1 thus returns it. We obtain the following corollary.

Corollary 1 Provided that the vertices are sorted such that $v_1, v_2, \ldots, v_m$ form a minimal identifying code
for some $1 \le m \le n$, Algorithm 1 will return this identifying code.
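
The effect of the vertex ordering can be observed on a small example. The sketch below is an illustration under assumptions made here: it re-expresses the procedure with neighborhood sets and an explicit `order` argument (both the function name and this interface are hypothetical), and it uses the path on five vertices, for which {1, 2, 3} is a minimum identifying code, since two codewords yield at most three distinct nonempty traces.

```python
def lex_identifying_code(edges, order):
    """Lexicographic procedure where `order` lists the vertices from first to last.

    Returns the set of codewords, or None if two vertices are twins.
    """
    rank = {v: r for r, v in enumerate(order)}
    nbhd = {v: {v} for v in order}
    for v, w in edges:
        nbhd[v].add(w)
        nbhd[w].add(v)

    def first(vertices):
        """Earliest vertex of a nonempty set with respect to `order`."""
        return min(vertices, key=rank.__getitem__)

    code = set()
    for pos, v in enumerate(order):
        trace = nbhd[v] & code
        if not trace:  # v is not covered
            code.add(first(nbhd[v]))
            continue
        # v is covered: look for an earlier vertex with the same trace
        twin = next((u for u in order[:pos] if nbhd[u] & code == trace), None)
        if twin is not None:
            diff = nbhd[v] ^ nbhd[twin]
            if not diff:  # equal closed neighborhoods: not twin-free
                return None
            code.add(first(diff))
    return code


path5 = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(lex_identifying_code(path5, [0, 1, 2, 3, 4]))  # {0, 1, 2, 3}
print(lex_identifying_code(path5, [1, 2, 3, 0, 4]))  # {1, 2, 3}
```

With the natural ordering the output has four codewords, whereas placing the vertices of the minimum code {1, 2, 3} first makes the procedure return exactly that code, as Corollary 1 predicts.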




Figure 1: A graph and a sorting of vertices such that the lexicographic code is not minimal 

Proposition 2 also implies that the probability that the output has cardinality no more than $K$ is at least
the probability that the first $K$ vertices form an identifying code. Hence, for a uniformly random ordering of the
vertices, our algorithm returns a minimum identifying code with probability at least $\binom{n}{i(G)}^{-1}$.

Proposition 3 The running time of Algorithm 1 is $O(n^3)$ binary operations.

Proof Clearly, we have to run the iteration for $j$ exactly $n$ times. For each iteration, the step demanding
the highest number of operations is the search for $k$. We consider at most $j - 1$ values of $k$, comparing at
most $n$ bits to verify whether $X(j) \neq X(k)$. Therefore, the running time is $O(n^3)$. □

3 Algorithm for sparse graphs 

For sparse graphs, it is more efficient not to work with the whole adjacency matrix, but with the neighborhood 
array $A \in P(E)^n$, defined as $A(v_i) = N(v_i)$, where the neighborhood is sorted in increasing lexicographic
order. Then, instead of adding the column of the adjacency matrix corresponding to a new codeword, we
only update the code array $X(v)$ for all vertices adjacent to the new codeword. The algorithm for sparse
graphs is given in Algorithm 3; its input is the neighborhood array, and it returns an identifying code $C$ or a
failure ($n + 1 \in C$) if the graph is not twin-free.

Similar to the general case, the min3($j$, $k$) subroutine produces the first vertex $v_l$ which identifies $v_j$ if it
exists and a failure otherwise, i.e. it determines the first vertex in lexicographic order which covers either $j$
or $k$, but not both. It is given in Algorithm 4.
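
One possible way to realise such a subroutine on sorted neighborhood lists is a simultaneous scan of the two lists. The Python sketch below is an assumption about how this could be done, not a transcription of the paper's pseudocode; vertices are indexed from 0 and the value `n` stands in for the failure value $n + 1$.

```python
def min3(nbhd_j, nbhd_k, n):
    """Smallest vertex lying in exactly one of two sorted neighborhood lists.

    nbhd_j and nbhd_k are strictly increasing lists of vertex indices.
    Returns n (the failure value) when the two lists are identical.
    """
    a = b = 0
    while a < len(nbhd_j) and b < len(nbhd_k):
        if nbhd_j[a] != nbhd_k[b]:
            # The lists agree before this position, so the smaller of the two
            # entries cannot appear in the other list.
            return min(nbhd_j[a], nbhd_k[b])
        a += 1
        b += 1
    if a < len(nbhd_j):  # nbhd_k is a proper prefix of nbhd_j
        return nbhd_j[a]
    if b < len(nbhd_k):  # nbhd_j is a proper prefix of nbhd_k
        return nbhd_k[b]
    return n


print(min3([1, 2, 3], [2, 3, 4], 5))  # 1
print(min3([0, 1, 2], [0, 1, 2], 3))  # 3, i.e. failure: the two vertices are twins
```

Since a closed neighborhood has at most $d + 1$ elements, this scan inspects at most $d + 1$ entries of each list.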

The same results on correctness and the possibility of returning a minimum code also hold for Algorithm 3;
they are summarized below.

Proposition 4 If the graph is not twin-free, then Algorithm 3 returns a failure. Otherwise, the algorithm
returns an identifying code contained in $\{v_1, v_2, \ldots, v_m\}$, where $m$ is the minimum integer such that this
forms an identifying code.

The running time of Algorithm 3 is $O(n^2 d \log n)$ binary operations.

Proof The proof of correctness of Algorithm 3 is similar to that of Algorithm 1 and is hence omitted.
We hence determine the running time of the algorithm. □